XML based Framework for ETL Processes For Relational Databases

Authors

  • TASSAWAR IQBAL
  • NADEEM DAUDPOTA
Abstract

In data warehousing, Extraction-Transformation-Loading (ETL) comprises the key tasks responsible for extracting data from several sources, cleansing and customizing it, and inserting it into the data warehouse [10]. More specifically, ETL tools are a category of specialized tools that deal with data warehouse cleaning and loading problems. These tasks are critical in every data warehouse environment. ETL and data cleaning tools are estimated to account for at least one third of the effort and expense in a data warehouse budget [1,11]; other evidence indicates that the ETL process accounts for 55% of the total cost of the data warehouse [1,12]. In this paper, we focus on the problem of defining ETL processes using XML, in order to make the framework more generic and capable of dealing with heterogeneous source systems. We describe a framework that extracts data from various heterogeneous source systems and carries it in XML files; data cleaning is then performed using a few predefined XML templates and predefined functions, and ultimately the data is loaded into the data warehouse according to the warehouse schema.
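
The three stages named in the abstract (extract into XML, clean with XML templates and predefined functions, load into the warehouse schema) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the template format, the `CLEANERS` functions, the table names, and the use of in-memory SQLite databases as source and warehouse are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical cleaning functions; the paper's predefined functions are not listed.
CLEANERS = {
    "trim": lambda v: v.strip(),
    "upper": lambda v: v.upper(),
}

# Hypothetical XML cleaning template: maps a column to a cleaning function.
TEMPLATE = """
<template table="customer">
  <rule column="name" function="trim"/>
  <rule column="country" function="upper"/>
</template>
"""

def extract_to_xml(conn, table):
    """Extract every row of a source table into an XML tree."""
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    root = ET.Element("table", name=table)
    for row in cur:
        rec = ET.SubElement(root, "row")
        for col, val in zip(columns, row):
            ET.SubElement(rec, col).text = "" if val is None else str(val)
    return root

def clean(root, template_xml):
    """Apply the template's cleaning rules to the extracted XML in place."""
    template = ET.fromstring(template_xml)
    for rule in template.findall("rule"):
        func = CLEANERS[rule.get("function")]
        for cell in root.iter(rule.get("column")):
            cell.text = func(cell.text or "")
    return root

def load(root, conn, table, columns):
    """Insert the cleaned rows into the warehouse table."""
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = [tuple(rec.findtext(c) for c in columns) for rec in root.findall("row")]
    conn.executemany(sql, rows)
    conn.commit()
    return len(rows)

def run_demo():
    # Source system (assumed): one table with untidy values.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customer (name TEXT, country TEXT)")
    source.executemany("INSERT INTO customer VALUES (?, ?)",
                       [("  Ali ", "pk"), ("Sara", "ir")])
    # Warehouse (assumed): a single dimension table.
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE dim_customer (name TEXT, country TEXT)")
    xml_rows = clean(extract_to_xml(source, "customer"), TEMPLATE)
    load(xml_rows, warehouse, "dim_customer", ["name", "country"])
    return warehouse.execute(
        "SELECT name, country FROM dim_customer ORDER BY name").fetchall()
```

Keeping the intermediate data in XML means that only `extract_to_xml` depends on the source system, which matches the abstract's aim of handling heterogeneous sources with one generic framework.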

Similar resources

Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...


Quest: Efficient SPARQL-to-SQL for RDF and OWL

Motivation. One of the most important uses of semantic technology is that of Ontology Based Data Access (OBDA), where the objective is to use shared vocabularies and ontologies as means to access data living in possibly disperse and heterogenous data sources (e.g., relational DBMS, XML databases, spreadsheets, etc.) Today this task often involves an ETL process in which the data is (E)xtracted ...


A relational-XML data warehouse for data aggregation with SQL and XQuery

Integration of multiple data sources is becoming increasingly important for enterprises that cooperate closely with their partners for e-commerce. OLAP enables analysts and decision makers fast access to various materialized views from data warehouses. However, many corporations have internal business applications deployed on different platforms. No standard solution for integration exists, exc...


Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files into RDF

Many solutions have been developed to convert non-RDF data to RDF. A common task during this conversion is applying data manipulation functions to obtain the desired output. Depending on the data format of the source to be transformed, one can rely on the underlying technology, such as RDBMS for relational databases or XQuery for XML, to manipulate data to a certain extent while generating RDF....


Publication date: 2006